Computing Temporal Trends in Web Documents

Author

  • Mark Last
Abstract

Most existing methods of web content mining assume that web documents are static. This assumption is inadequate for long-term monitoring and analysis of web content, since both users' interests and the content of most web sites change continuously over time. In this research, we aim to develop computationally intelligent and efficient text mining techniques that enable continuous comparison between documents provided by the same source (website, institute, organization, cult, author, etc.) or viewed by the same group of users (e.g., university students), together with timely detection of temporal trends in those documents. Our approach builds upon the recently developed methodology for fuzzy comparison of frequency distributions. The proposed techniques are evaluated on a real-world stream of web traffic.

Similar Resources

Introducing New Trends for Persian CAPTCHA

To distinguish human users from computer programs and thereby enhance security, a popular test called CAPTCHA is used on the Web. CAPTCHA plays an important role in preventing Denial of Service (DoS) attacks in computer networks. There are many different types of CAPTCHA in different languages. Due to the growth of Persian-language content and documents on the Internet, creating a suitable Persian CAPTCHA seems t...

Temporal ranking for fresh information retrieval

In business, the retrieval of up-to-date, or fresh, information is very important. It is difficult for conventional search engines based on a centralized architecture to retrieve fresh information, because they take a long time to collect documents via Web robots. In contrast to a centralized architecture, a search engine based on a distributed architecture does not need to collect documents, b...

The design, implementation, and performance of the V2 temporal document database system

It is now feasible to store previous versions of documents, and not only the most recent version, which has been the traditional approach. This is of interest in a number of applications, including temporal document databases, web archiving systems, and temporal XML warehouses. In this paper, we describe the architecture and the implementation of V2, a temporal document database syst...

Concepts of Bitemporal Database Theory and the Evolution of Web Documents

A vast amount of temporal information is provided on the web. Even though many facts expressed in documents are time-related, the temporal properties of web presentations have not received much attention. In database research, temporal databases have become a mainstream topic in recent years. In web documents, temporal data may exist as metadata in the header and as user-directed data in the bo...

Analyzing the Collaboration Network of Global Scientific Outputs in the Field of Bibliotherapy in the Web of Science Database

Background and Aim: Bibliotherapy is a useful approach to the prevention and treatment of mental disorders and has given rise to many scientific publications in this field. The purpose of this study was to investigate the publication trends in the field of bibliotherapy and visualize the structure of its scientific collaborations based on the Web of Science database during the perio...

Publication date: 2005